RÉSUMÉ

In this report we evaluate the effectiveness of 6 algorithms
specialized in segmenting targets in IR images. These segmentation
algorithms, or segmenters, all rest on the principle that the thermal
signature of a target exceeds that of any object in the background.
The first 3 segmenters tackle the image head-on, as a whole, whereas
the last 3 incorporate the Background Elimination Technique (BET),
which aims to eliminate all or part of the background by levelling it.
The algorithms are judged according to a) their extraction rate, b) the
fidelity of the segmentation process with respect to the geometrical
properties of the targets and, finally, c) the degree of
distinctiveness imparted to the extracted targets relative to the
nontargets. The 3 segmenters built on BET have a better extraction
probability than the other three, which try to eliminate the background
simply by partitioning the image. Moreover, most of the segmenters
considered in this report alter the shape of the targets in one way or
another. The two exceptions to this rule are segmenter No. 1 (single
intensity threshold silhouette generator) and No. 6 (the preceding
segmenter combined with a particular version of BET). The experimental
results show that intensity, contrast and variance are the features
that best separate the targets from the nontargets. Classification
experiments carried out with segmenter No. 6, which proves to be the
best, on the basis of these features indicate that one can expect a
detection rate exceeding 90% with a false alarm rate below 3%. (U)
ABSTRACT

This report presents an evaluation of the performance of 6
algorithms dedicated to segmentation of targets in IR imagery. These
segmentation algorithms, or segmenters, are based on the single
assumption that the targets display a larger thermal signature than the
background. The first 3 segmenters deal with an image in its entirety,
whereas the last 3 incorporate the Background Elimination Technique
(BET), which aims at eliminating the background wholly or partly by
levelling it. The segmenters are judged according to a) their
extraction rate; b) the fidelity of the segmentation with respect to
the geometrical properties of the extracted targets; and c) the degree
of distinctiveness imparted to the extracted targets as opposed to the
nontargets. The 3 segmenters relying on BET have a better extraction
rate than the other 3, which try to cope with the background simply by
partitioning the image. Most segmenters here distort the shape of the
targets in one way or another. The two exceptions are segmenter No. 1
(Single Intensity Threshold Silhouette Generator or SIT Generator) and
No. 6 (SIT Generator in conjunction with a particular version of BET).
The experimental results show that the intensity, contrast and variance
features are the most effective in discriminating the targets from the
nontargets. The classification results one can expect from these
features together with the segmenter that proves to be the best
(segmenter No. 6) amount to a detection rate in excess of 90% with a
false alarm rate not greater than 3%. (U)
1.0 INTRODUCTION


A previous report (Ref. 1) describes various segmentation
algorithms developed at DREV in relation to target acquisition in IR
imagery. The present progress report evaluates the performance of these
segmentation algorithms.

A total of 6 segmentation algorithms, or segmenters, are
investigated. The first is the Single Intensity Threshold Silhouette
Generator, an early algorithm (Refs. 2 and 3) that has been successful
in detecting targets in IR BOFORS imagery. In Ref. 1, we demonstrate
that one can use a thresholding intensity function in lieu of a fixed,
global threshold, thus giving, among other possibilities, the Staircase
Intensity Threshold Silhouette Generator and the Interpolated Staircase
Intensity Threshold Silhouette Generator. These constitute the first 3
segmentation algorithms. The last 3 algorithms, unlike the
aforementioned ones, do not deal with an image in its entirety.
Instead, they first try to eliminate the background or, at the very
least, to make it uniform. To this end, they incorporate the Background
Elimination Technique (BET) expounded in Ref. 1, a technique which
operates on a line-by-line basis and uses a narrow-bandwidth low-pass
filter to assess the general tendency of the background in order to
subtract it from the signal corresponding to a line of the image.
Because of its real-time implementation potential, we opted for a
recursive filter and, more explicitly, for a 4-pole Butterworth filter
(Ref. 1). Since BET can be applied either to the set of lines or to the
set of columns of an image, it generates 2 images referred to as the
Horizontal Fine Structure image and the Vertical Fine Structure image
respectively. One can then attempt to extract targets by segmenting one
of the fine structure images, or a combination of both, with the aid
of, say, the Single Intensity Threshold Silhouette Generator. The
options retained are explained in Sec. 2.0.

As the evaluation process is necessarily based on some sort of
imagery, its scope is somewhat limited. The imagery we used is known as
the Alabama Data Base and consists of 43 thermoscopic images. These
contain tanks, armoured personnel carriers, jeeps and, in one instance,
a bus. Although they represent ground scenes, we would term their
background as moderately cluttered. On the other hand, the images are
relatively clean and, for all practical purposes, can probably be
considered as noise free. Hence, this imagery constitutes a good test
of the segmentation algorithms, although it might not be representative
of real-life battlefield situations.

The effectiveness of a particular segmenter is generally
characterized:

a) first, by its extraction rate, that is, its ability to
   segment all the targets present in the imagery;

b) by the fidelity of the segmentation process as regards the
   geometrical properties of the targets; this aspect is
   important to further discriminate the targets into classes;

c) and, finally, by what we would call the degree of
   distinctiveness introduced among the segmented objects and,
   in particular, between targets and nontargets.
The first two points are quite easy to evaluate since we know beforehand
the exact number of targets as well as their respective locations. The
last point is a bit more tricky, for the segmented objects cannot be
dissimilar in every way. So the problem is really twofold: determine
the most discriminatory feature or set of features, and measure how well
it separates the segmented objects corresponding to targets from those
corresponding to nontargets. Section 4 lists and defines all the
features that were extracted. We limited ourselves to those that can be
extracted sequentially in the space of a single pass over the image.
The sequential extractor used is based on the Labelling-by-Tracking
Algorithm, a short description of which is given in Sec. 5. More
information about this extractor can be found in Ref. 4. The ability of
a given feature to discriminate between target and nontarget segmented
objects is judged according to the histograms of that feature computed
separately for the targets as a whole and for the nontargets as a whole.
If both histograms peak at the same feature value, the feature in
question is useless. On the contrary, if the 2 histograms do not
overlap at all, that feature alone is sufficient to isolate the targets.
Hence, the amount of overlap is a measure of the discrimination
power. We have gathered together in Sec. 6.1 all the histograms that
were determined for 2 segmenters out of 6, and in Sec. 6.2 some scatter
plots of the most useful features. Section 6.3 discusses the pros and
cons of the various segmentation algorithms and concludes that the
best segmenter here is the Single Intensity Threshold Silhouette
Generator in conjunction with the arithmetic mean of the 2 fine
structure images.
2.0 SEGMENTATION ALGORITHMS TO BE EVALUATED

The segmentation algorithms investigated are:

1) The Single Intensity Threshold Silhouette Generator,
   hereafter designated as SIT Generator.

2) The Staircase Intensity Threshold Silhouette Generator,
   hereafter designated as SCIT Generator.

3) The Interpolated Staircase Intensity Threshold Silhouette
   Generator, hereafter designated as ISCIT Generator.

4) The SIT Generator together with the Horizontal Fine Structure
   image, hereafter designated as SIT Generator with BET(HFS);
   BET stands for Background Elimination Technique.

5) The SIT Generator together with the Maximal Fine Structure
   image, hereafter designated as SIT Generator with BET(Max).

6) The SIT Generator together with the Mean Fine Structure
   image, hereafter designated as SIT Generator with BET(Mean).

We will also sometimes refer to these segmentation algorithms as
segmenter No. followed by the appropriate number.

The SIT Generator is an early algorithm that was used to detect
targets in IR BOFORS imagery (Refs. 2 and 3). The defining procedure of
this segmenter as applied to the imagery used for evaluation is:

a) Divide the image into 16 sub-images by quartering both axes.

b) Determine the histogram of each sub-image.

c) Find the cutoff gray level of each partial histogram.
   Scanning the histogram from the highest bin down, the cutoff
   gray level is defined as the gray level of the first bin
   occupied by at least 3 pixels.

d) Discard the cutoff gray levels less than the 80th percentile
   of the histogram of the whole image and then choose as a
   global intensity threshold the smallest of the remaining
   cutoff gray levels.

e) In extremis, if it ever happens that all the cutoff gray
   levels are equal, use the 80th percentile as a threshold.
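
Steps a) to e) can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the original DREV implementation. The function names are hypothetical, the image is assumed to be an integer array with 256 gray levels, a pixel is assumed to belong to a silhouette when it is at or above the threshold, and the fallback used when no cutoff survives step d) is our own addition.

```python
import numpy as np

def cutoff_gray_level(sub_image, min_count=3, levels=256):
    """Step c: highest gray level whose bin holds at least min_count pixels."""
    hist = np.bincount(sub_image.ravel(), minlength=levels)
    occupied = np.nonzero(hist >= min_count)[0]
    # Scanning from the highest bin down, the first qualifying bin
    # is simply the largest qualifying index.
    return int(occupied[-1]) if occupied.size else None

def sit_threshold(image, percentile=80.0, splits=4):
    """Global intensity threshold following steps a) to e)."""
    floor = np.percentile(image, percentile)      # 80th percentile, whole image
    cutoffs = []
    for band in np.array_split(image, splits, axis=0):      # quarter the rows
        for sub in np.array_split(band, splits, axis=1):    # quarter the columns
            level = cutoff_gray_level(sub)
            if level is not None:
                cutoffs.append(level)
    kept = [c for c in cutoffs if c >= floor]     # step d: discard low cutoffs
    # Step e (all cutoffs equal), plus our own fallback when none survives.
    if not kept or len(set(cutoffs)) == 1:
        return float(floor)
    return float(min(kept))

def sit_segment(image):
    """Binary silhouette image (assumption: threshold is inclusive)."""
    return image >= sit_threshold(image)
```

Note that on an image whose background level varies strongly from one sub-image to the next, the selected threshold can fall inside the brighter background regions, which is the weakness that the SCIT and ISCIT variants and BET try to address.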

The SCIT and ISCIT Generators are variants of the SIT Generator.

Formally, the defining procedure of the SCIT Generator is:

a) Partition the image horizontally into 4 independent sections.

b) Divide each section into 4 sub-images.

c) Determine the histogram of each sub-image within each
   section.

d) For each section, find the cutoff gray level of all the
   histograms.

e) For each section, discard the cutoff gray levels less than
   the 80th percentile of the sectional histogram and choose as
   a global intensity threshold for that section the smallest of
   the remaining cutoff gray levels.

This procedure generates 4 discrete intensity thresholds or, if we plot
the threshold for each line of the image against the line number, a
staircase-like discontinuous thresholding intensity function. As
explained in Ref. 1 and evidenced in Fig. 1, the SCIT Generator is bound
to create artifacts whenever the thresholds of two adjacent sections
differ widely. An obvious way to eliminate these artifacts consists in
smoothing the transition between two sections by linearly interpolating
the relevant thresholds. The resulting continuous thresholding
intensity function defines the ISCIT Generator. This generator as
well as the SIT and SCIT Generators are depicted in Fig. 1. Although we
did not implement it, it might be worthwhile to add to the SCIT and
ISCIT Generators a last-resort alternative, similar to e) above, for the
case where all the cutoff gray levels of a particular section are equal.
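
The two thresholding intensity functions can be sketched as follows in Python with NumPy. This is a hedged illustration under several assumptions that the text does not settle: the 4 sections are taken to be horizontal bands of consecutive lines, the interpolation is anchored at the centre line of each section and held constant beyond the outer centres, and a section with no surviving cutoff falls back on its 80th percentile. All function names are hypothetical; the exact scheme is the one given in Ref. 1.

```python
import numpy as np

def section_threshold(section, percentile=80.0, splits=4, min_count=3):
    """Threshold of one section (steps b to e of the SCIT procedure)."""
    floor = np.percentile(section, percentile)
    cutoffs = []
    for sub in np.array_split(section, splits, axis=1):
        hist = np.bincount(sub.ravel(), minlength=256)
        occupied = np.nonzero(hist >= min_count)[0]
        if occupied.size:
            cutoffs.append(int(occupied[-1]))     # highest bin with >= 3 pixels
    kept = [c for c in cutoffs if c >= floor]
    # Fallback on the percentile is our own assumption, not part of the report.
    return float(min(kept)) if kept else float(floor)

def threshold_functions(image, sections=4):
    """Return the per-line staircase (SCIT) and interpolated (ISCIT) thresholds."""
    n_lines = image.shape[0]
    bands = np.array_split(np.arange(n_lines), sections)
    levels = np.array([section_threshold(image[rows]) for rows in bands])
    staircase = np.concatenate(
        [np.full(rows.size, t) for rows, t in zip(bands, levels)])
    # Assumption: interpolation nodes sit at the centre line of each section.
    centres = np.array([rows.mean() for rows in bands])
    interpolated = np.interp(np.arange(n_lines), centres, levels)
    return staircase, interpolated

def segment(image, line_thresholds):
    """Threshold every line of the image with its own intensity threshold."""
    return image >= line_thresholds[:, None]
```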

FIGURE 1 - Segmenters No. 1, 2 and 3

A) Image ALA 6 3 from the Alabama Data Base
   (1: raw; 2: histogram equalized; 3: sub-images delineated)

B) Thresholding Intensity Functions

C) Segmented Images

   1) Single Intensity Threshold

   2) Staircase Intensity Threshold

   3) Interpolated Staircase Intensity Threshold

The 3 segmenters we will now outline, unlike the previous ones,
do not deal with the image in its entirety. In fact, they all include a
common technique which aims to suppress all or part of the background.
This technique, referred to as BET and described in detail in Ref. 1,
operates on a one-dimensional signal (a given line or column of an
image) and uses a narrow-bandwidth low-pass filter to assess the general
tendency of the background and then subtract it from the signal itself.
Because of its real-time implementation potential, we opted for a
recursive infinite impulse response filter and, to be more specific, for
a 4-pole Butterworth filter (FPBF). Such a filter can be realized (Ref.
1) as a cascade of 2 second-order systems. The resulting set of linear
difference equations is:

    f_1(nT) = x[(n-2)T] ,

    f_2(nT) = f_1(nT) - b_1 f_2[(n-1)T] - b_2 f_2[(n-2)T] ,
                                                                  [1]
    f_3(nT) = f_2(nT) - b_3 f_3[(n-1)T] - b_4 f_3[(n-2)T] ,

    y(nT) = b_0 f_3(nT)

with

    b_0 = (1 + b_1 + b_2)(1 + b_3 + b_4) ,

    b_1 = -(Z_1 + Z_1^*) ,   b_2 = Z_1 Z_1^* ,                    [2]

    b_3 = -(Z_2 + Z_2^*) ,   b_4 = Z_2 Z_2^*

where

    Z_1 = \exp[-2\pi f_c (\cos 67.5^\circ - j \sin 67.5^\circ) / f_s] ,
                                                                  [3]
    Z_2 = \exp[-2\pi f_c (\cos 22.5^\circ - j \sin 22.5^\circ) / f_s] .
In these equations, x designates the input signal, y the filtered output
signal, T the sampling interval, f_c the 3-dB cutoff frequency of the
filter and f_s the sampling frequency of the signal. The asterisk in [2]
denotes the complex conjugate.

To illustrate BET we will use the signal of Fig. 2, which
corresponds to line 175 of image 6 from the Alabama Data Base, and
assume that the 3-dB normalized cutoff frequency (f_c/f_s) of the
low-pass FPBF digital filter is equal to 0.01 (to process the evaluation
imagery we used a cutoff frequency of 0.05; see Ref. 1). The filtered
signal generated by such a filter is shown in Fig. 2a along with the
input signal. Two points are worth mentioning about the filtered signal:

a) There is a droop in the filtered signal at its origin.

b) The filtered signal is shifted to the right.

The first anomaly can be easily corrected by selecting the initial
conditions so that there is no transient at the origin. It can be shown
(Ref. 1) that the required initial conditions are:


    f_2(nT) = H / (1 + b_1 + b_2)

and                                                               [4]

    f_3(nT) = H / [(1 + b_1 + b_2)(1 + b_3 + b_4)]

for n < 0; H is the value of the input signal at t = 0. Fig. 2b shows the
filtered signal (solid line) that results when we use these new initial
conditions. The second anomaly can be corrected just as easily by
shifting the filtered signal to the left. However, rather than
rectifying this anomaly we will take advantage of it to clip the peaks.
Let us consider Fig. 2b. The signal is fed to the filter from left to
right. Normally, we would expect the filtered signal to peak at, or
close to, the position of the main spike in the input signal. Instead,
it overshoots to the right. Therefore, had the signal been fed from
right to left, the overshoot would have occurred to the left (dashed
line in Fig. 2b). By combining both filtered signals in some fashion, we
can expect to end up with a curve that bypasses the peaks entirely and
follows only the broad characteristics of the input signal. Various
combinations were tried (Ref. 1). All things considered, the arithmetic
mean (Fig. 2c) was judged most satisfactory. Fig. 2d exhibits the fine
structure (fluctuating component) of the illustrative signal, that is,
what is left of the signal once the estimated trend of the background is
removed.

FIGURE 2 - Background Elimination Technique (BET)
           The illustrative signal is line 175 of
           image 6 from the Alabama Data Base.
           The 3 peaks correspond respectively
           (from left to right) to a tank, an APC
           and a jeep. The cutoff frequency of
           the filter is 0.01.

           A) FPBF filter initially at rest

           B) FPBF filter with nonzero initial
              conditions; the solid line is the
              left filtered signal while the
              dashed line is the right filtered
              signal.

           C) Arithmetic mean of the 2 filtered
              signals

           D) Fine structure or fluctuating
              component of the input signal
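
The whole BET chain for one line (recursive filter, initial conditions, left and right passes, arithmetic mean, subtraction) can be sketched in Python with NumPy as below. It is a minimal sketch under stated assumptions, not the implementation of Ref. 1: the gain b_0 is taken so that the filter has unit gain at zero frequency, the input is assumed to hold its first value for n < 0, and the function names are hypothetical.

```python
import numpy as np

def fpbf_coefficients(fc_over_fs):
    """Coefficients b0..b4 of the 4-pole filter from the normalized cutoff."""
    w = 2.0 * np.pi * fc_over_fs
    poles = [np.exp(-w * (np.cos(a) - 1j * np.sin(a)))
             for a in np.deg2rad([67.5, 22.5])]
    # For each pole Z: -(Z + Z*) = -2 Re(Z) and Z Z* = |Z|^2.
    (b1, b2), (b3, b4) = [(-2.0 * z.real, abs(z) ** 2) for z in poles]
    b0 = (1.0 + b1 + b2) * (1.0 + b3 + b4)    # assumed: unit gain at DC
    return b0, b1, b2, b3, b4

def fpbf_lowpass(x, fc_over_fs):
    """Cascade of 2 second-order sections, started without transient."""
    b0, b1, b2, b3, b4 = fpbf_coefficients(fc_over_fs)
    x = np.asarray(x, dtype=float)
    h = x[0]                                  # input value at the origin
    f2_1 = f2_2 = h / (1.0 + b1 + b2)         # f2 at n-1 and n-2 for n < 0
    f3_1 = f3_2 = h / b0                      # f3 at n-1 and n-2 for n < 0
    y = np.empty_like(x)
    for n in range(x.size):
        f1 = x[n - 2] if n >= 2 else h        # assumption: x = h before the origin
        f2 = f1 - b1 * f2_1 - b2 * f2_2
        f3 = f2 - b3 * f3_1 - b4 * f3_2
        y[n] = b0 * f3
        f2_1, f2_2 = f2, f2_1
        f3_1, f3_2 = f3, f3_1
    return y

def bet_line(x, fc_over_fs=0.05):
    """Fine structure of a 1-D signal: input minus estimated background trend."""
    x = np.asarray(x, dtype=float)
    left = fpbf_lowpass(x, fc_over_fs)                  # fed left to right
    right = fpbf_lowpass(x[::-1], fc_over_fs)[::-1]     # fed right to left
    return x - 0.5 * (left + right)

def fine_structure(image, axis=1, fc_over_fs=0.05):
    """axis=1 filters every line (HFS); axis=0 filters every column (VFS)."""
    return np.apply_along_axis(bet_line, axis,
                               np.asarray(image, dtype=float), fc_over_fs)
```

With these initial conditions a constant line yields a fine structure of zero, and an isolated spike on a flat background is passed through almost unchanged because neither filtered signal has started to respond at the position of the spike.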


The Background Elimination Technique can be applied either to the
set of lines or to the set of columns of an image, thus producing 2
distinct images (Fig. 3) referred to as the Horizontal Fine Structure
(HFS) image and the Vertical Fine Structure (VFS) image respectively.
Although these images turn out to be highly textured, they do not
exhibit, unlike the parent image, large-scale fluctuations. This is
important, for large-scale fluctuations may easily fool a segmenter
like the SIT Generator, which is based on the single assumption that the
targets present a larger thermal signature than the background. The
SCIT and ISCIT Generators do try to circumvent the problem by slicing
the image into sections whose background may be considered as
"uniform", but then the crux lies in the manner in which the sections
are defined. With the 2 fine structure images, this crucial question
does not arise because their background, on a large-scale basis, is
inherently uniform. In consequence, the SIT Generator should be well
suited for thresholding the fine structure images. This is confirmed by
the results of Figs. 3d and 3e. The procedure leading to Fig. 3d defines
segmenter No. 4: SIT Generator with BET(HFS).

FIGURE 3 - Thresholding of the fine structure
           images derived from image 6 3 of
           the Alabama Data Base:

           a) Original histogram-equalized image

           b) Horizontal Fine Structure (HFS) image

           c) Vertical Fine Structure (VFS) image

           d) Segmented image generated by
              thresholding HFS with the SIT
              Generator; this defines segmenter
              No. 4: SIT Generator with BET(HFS).

           e) Segmented image generated by
              thresholding VFS with the SIT
              Generator

           The images b and c were postprocessed,
           for display purposes, first by adding
           a constant bias, so as to remove
           negative gray levels, and then by
           stretching the gray levels bounded
           by the 5th and 95th percentiles
           linearly over the display range.

The segmentation of both the HFS image (Fig. 3d) and the VFS
image (Fig. 3e) results in targets whose shape is slightly distorted.
However, since the distortion is more pronounced in one direction than
in the other, and since the dimension affected differs according to
whether HFS or VFS is involved, it should be possible to keep the shape
of the targets intact by thresholding a joint image resulting from some
combination of HFS and VFS. The following sensible combinations were
formed:

a) Maximal Fine Structure image, where the value at any given
   location corresponds to the maximum of HFS and VFS for that
   location.

b) Mean Fine Structure image, where the value at any given
   location corresponds to the arithmetic mean of HFS and VFS
   for that location.

The first combination defines segmenter No. 5 (SIT Generator with
BET(Max)) and the second one segmenter No. 6 (SIT Generator with
BET(Mean)).
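
The two combinations are simple pixel-wise operations. The short Python sketch below (NumPy; the function name is hypothetical) shows them, assuming the HFS and VFS images are available as arrays of identical shape. Thresholding the result with the SIT Generator would additionally require mapping the real-valued, partly negative fine structure values onto gray levels, a step the text does not detail and that is therefore left out.

```python
import numpy as np

def combine_fine_structure(hfs, vfs):
    """Return the Maximal and Mean Fine Structure images."""
    hfs = np.asarray(hfs, dtype=float)
    vfs = np.asarray(vfs, dtype=float)
    maximal = np.maximum(hfs, vfs)     # input to segmenter No. 5, BET(Max)
    mean = 0.5 * (hfs + vfs)           # input to segmenter No. 6, BET(Mean)
    return maximal, mean
```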

3.0  IMAGERY  USED  FOR  EVALUATION 

To reliably evaluate the performance of a particular segmenter,
we need some sort of imagery to start with. However, this very fact
somewhat limits the scope of the evaluation to a certain type of
background, noise, image quality, etc. The imagery used here for
evaluation is known as the Alabama Data Base and consists of 43
thermoscopic images. The spectral region of the majority of them (30
out of 43) corresponds to the 8-14 µm band, and that of the remaining
ones to the 3-5 µm band. Altogether the images contain 85 targets, some
of them so close to each other as to form a distinct entity, distributed
as follows: 40 tanks, 29 armoured personnel carriers (APC), 15 jeeps
and, finally, a bus. The number of targets in a single image never
exceeds 3 and no image contains 2 targets of the same type. Although
the images represent ground scenes, we would term their background as
moderately cluttered. On the other hand, the images are relatively
clean and, for all practical purposes, can probably be considered as
noise free. The size of the images is 420 x 335 pixels and they are
digitized according to a 256-level grayscale. The images were in no way
preprocessed prior to segmentation but, for display purposes (e.g. Figs.
1 and 3), they were postprocessed by histogram equalization, which
almost consistently yields "good-looking" images.
4.0 FOUNDATIONS OF THE EVALUATION PROCESS

An overview of the scientific literature devoted to automatic
target acquisition would reveal that the effectiveness of a particular
segmenter is generally characterized

a) first, by its extraction rate, that is, its ability to
   segment all the targets present in the imagery;

b) by the fidelity of the segmentation as regards the
   geometrical properties of the targets; this aspect is
   important to further discriminate the targets into classes;

c) and, finally, by what we would call the degree of
   distinctiveness introduced among the segmented objects and,
   in particular, between targets and nontargets.

The first two points are quite easy to evaluate since we know beforehand
the exact number of targets as well as their respective locations. The
last point is a bit more tricky, for the segmented objects cannot be
dissimilar in every way. So the problem is really twofold: determine
the most discriminatory feature or set of features, and measure how well
it separates the segmented objects corresponding to targets from those
corresponding to nontargets. In this section, we list and define all
the candidate features. We limited ourselves to those that can be
extracted sequentially in the space of a single pass over the image.
We will first define a quantity that appears in many expressions
below. Given an arbitrary segmented object S, the (p,q)th moment
m_{p,q} of S is defined as

    m_{p,q} = (1/N) \sum_{(x,y) \in S} x^p y^q u(x,y) ,   p,q = 0,1,2   [5]

where N denotes the number of points in the summation, x and y the
spatial coordinates, and where the value of u at any point (x,y) is
proportional to the brightness (or gray level) of S at that point. We
can now proceed with the definition of the features involved in the
evaluation process.

1) Area (A) - The area of S is just the number of points in S:

    A = N                                                         [6]


2) Perimeter (P) - Rosenfeld and Kak (Ref. 5) give 4 possible
definitions of the perimeter:

   a) The number of pairs of points (u,v) with u in S and v not
      in S,

   b) The number of steps taken by a border-following algorithm
      in following all the borders of S,

   c) The same, but with diagonal steps counting \sqrt{2} each,
      while horizontal and vertical steps count only 1 each,

   d) The number of border points of S.

We have adopted the first definition, for it can be easily computed
sequentially and also because it yields the right answer for a square.
It should be emphasized here that not only the perimeter of the outside
border but that of all the borders of the segmented object is computed.

3) Thinness Ratio (T) - The thinness ratio of a segmented object
with area A and perimeter P is defined (Ref. 6) by

    T = 4\pi (A / P^2) .                                          [7]

It can be shown that T has a maximum value of 1, which it achieves if
the segmented object in question is circular. Loosely speaking, the
fatter a segmented object is, the greater will be the associated
thinness ratio; conversely, line-like or largely perforated objects will
have a thinness ratio close to zero. Moreover, the thinness ratio is
dimensionless and hence depends only on the shape (but not the scale) of
the segmented object.
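
The first three features can be computed from a binary mask of the segmented object, as in the following Python sketch (NumPy; the function name is hypothetical). It assumes that the pairs (u,v) of the first perimeter definition are horizontally or vertically adjacent pixels and that pixels outside the image count as not belonging to S. This is a whole-image formulation for illustration, not the single-pass sequential extractor used in the report.

```python
import numpy as np

def area_perimeter_thinness(mask):
    """Area A, perimeter P (definition a) and thinness ratio T of a binary object."""
    # Pad with background so that pixels on the image edge have outside neighbours.
    s = np.pad(np.asarray(mask, dtype=bool), 1, constant_values=False)
    area = int(s.sum())
    perimeter = 0
    for axis in (0, 1):
        for shift in (1, -1):
            neighbour = np.roll(s, shift, axis=axis)
            # Count pairs (u, v) with u in S and its neighbour v not in S.
            perimeter += int(np.count_nonzero(s & ~neighbour))
    thinness = 4.0 * np.pi * area / perimeter ** 2 if perimeter else 0.0
    return area, perimeter, thinness
```

For a 3 x 3 square this gives A = 9 and P = 12, hence T = \pi/4. Removing the centre pixel adds the 4 pairs of the inner border, so P becomes 16, which illustrates that all the borders of the object are counted.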

4) Average Intensity (B) - The average intensity or brightness is
a function of the average temperature of the underlying object and is
defined as

    B = m_{0,0} .                                                 [8]
5) Average Contrast (C) - Contrast has many definitions. The one
used here is

    C = (B_o - B_b) / B_b                                         [9]

where B_o refers to the brightness of the segmented object and B_b to
that of the background immediately surrounding the object. We already
know how to compute B_o. We can compute B_b in the same way but, then, we
have to specify what the background \bar{S} of S is. Obviously, \bar{S}
should not include S or another segmented object. Moreover, it should
not stretch out too far away from S in order that C be a local measure
of contrast. These requirements are easily met by extending (Fig. 4) in
both directions independently all the runs of segmented pixels present
on any given scan line. The extension of one end of a run proceeds
until another run of segmented pixels is hit, or until the run has been
extended by half its length. In this way, the area of \bar{S} is about
the same as that of S. Although this scheme may distort the measurement
of C by introducing a certain degree of directionality, this is not a
major drawback, for the segmented objects more often resemble objects 1
and 2 in Fig. 4 than objects 3, 4 or 5, where the effect of
directionality is most damaging. On the other hand, an important asset
of this scheme is its ease of implementation.
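
The run-extension scheme can be sketched in Python with NumPy as follows. This is an illustrative sketch only. It assumes a label image in which 0 marks unsegmented pixels and each segmented object carries a positive integer label (a hypothetical input format), and it rounds "half its length" up to the next integer, which the text does not specify.

```python
import numpy as np

def runs(row_labels):
    """Yield (label, start, stop) for each run of segmented pixels on a scan line."""
    start, n = 0, len(row_labels)
    while start < n:
        stop = start + 1
        while stop < n and row_labels[stop] == row_labels[start]:
            stop += 1
        if row_labels[start] != 0:
            yield int(row_labels[start]), start, stop
        start = stop

def average_contrast(image, labels):
    """Contrast C = (B_o - B_b) / B_b of every labelled object."""
    image = np.asarray(image, dtype=float)
    labels = np.asarray(labels)
    obj_sum, obj_n, bg_sum, bg_n = {}, {}, {}, {}
    for pixels, row in zip(image, labels):
        for lab, start, stop in runs(row):
            length = stop - start
            reach = (length + 1) // 2       # assumption: half-length, rounded up
            obj_sum[lab] = obj_sum.get(lab, 0.0) + pixels[start:stop].sum()
            obj_n[lab] = obj_n.get(lab, 0) + length
            left = range(start - 1, max(start - 1 - reach, -1), -1)
            right = range(stop, min(stop + reach, len(row)))
            for side in (left, right):      # extend both ends independently
                for col in side:
                    if row[col] != 0:       # another run of segmented pixels is hit
                        break
                    bg_sum[lab] = bg_sum.get(lab, 0.0) + pixels[col]
                    bg_n[lab] = bg_n.get(lab, 0) + 1
    contrast = {}
    for lab in obj_n:
        if bg_n.get(lab, 0) == 0 or bg_sum[lab] == 0:
            continue                        # no usable background for this object
        b_o = obj_sum[lab] / obj_n[lab]
        b_b = bg_sum[lab] / bg_n[lab]
        contrast[lab] = (b_o - b_b) / b_b
    return contrast
```

A background pixel lying between two nearby runs may be reached from both sides, in which case it contributes to the background brightness of two different objects.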

FIGURE 4 - The pixels used to compute the brightness B_b of the
           background of the 5 numbered objects are marked with O's.
           The pixels marked with fl's enter into the computation of 2
           different B_b's.

6) Relative Intensity (B') - The relative intensity is obtained
by mapping the average intensity of a segmented object within a given
frame into a scale from 0 to 1, that is, a scale independent of the
overall characteristics of the imagery. We originally defined the
relative intensity as

    B'_i = (B_i - B_{min}) / (B_{max} - B_{min}) ,   i = 1,2,...,M    [10]

where M is the number of segmented objects within the frame in question
and B_{min} and B_{max} are respectively the smallest and the largest of
all the B_i's. However, this expression is not flawless. For example,
if M = 2, the relative intensity of one segmented object will be 1 and
that of the other 0 regardless of their respective brightness. From the
standpoint of discrimination, this is not desirable, for it might well
occur that both segmented objects are targets. In fact, the same
situation is bound to happen each time the number of segmented objects
is equal to or less than the number of expected targets. So, when the
number of segmented objects is small, we use the following expression
instead of [10]:

    B'_i = B_i / B_{max} ,   i = 1,2,...,M ;   M < 6 .                [11]

In our case small means less than 6; this number is greater than the
maximum number of targets one can find in any image of the Alabama Data
Base in order to account for multiply segmented targets. One may wonder
why [11] alone is not used. It is simply because experimentation shows
that in most cases [10] better discriminates the targets from the
nontargets.

7) Relative Contrast (C') - The relative contrast is defined in the
same way as the relative intensity.
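
The choice between expressions [10] and [11] can be sketched in Python with NumPy as below. The function name is hypothetical, and the guard for the case where all the objects of a frame have the same brightness is our own addition, since [10] is undefined there. The same function would apply to the relative contrast by passing the C values instead of the B values.

```python
import numpy as np

def relative_intensity(b, small=6):
    """Map the average intensities B_i of one frame onto a 0-to-1 scale."""
    b = np.asarray(b, dtype=float)
    if b.size < small:
        return b / b.max()                 # expression [11], few segmented objects
    span = b.max() - b.min()
    if span == 0.0:
        return np.ones_like(b)             # our own guard: identical brightness
    return (b - b.min()) / span            # expression [10]
```

With two objects of brightness 50 and 100, expression [11] yields 0.5 and 1, whereas [10] would have forced the values 0 and 1 whatever the actual brightness.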
8) Centroid $(\bar{x}, \bar{y})$ - The centroid of a blob (segmented object) is
the point $(\bar{x}, \bar{y})$ whose coordinates are given by

$$\bar{x} = \frac{m_{1,0}}{m_{0,0}} \quad \text{and} \quad \bar{y} = \frac{m_{0,1}}{m_{0,0}} . \qquad [12]$$

9) Principal Axis ($\theta$) - The principal axis $\theta$ is the angle for
which the moment of inertia of a blob about a line through its centroid
is as small as possible. It can be shown that

$$\theta = \frac{1}{2} \arctan\left( \frac{2 M_{1,1}}{M_{2,0} - M_{0,2}} \right)$$

where

$$M_{p,q} = \frac{m_{p,q}}{m_{0,0}} - \bar{x}^{p} \bar{y}^{q} .$$

The $M_{p,q}$'s are nothing but central moments, that is, the above moments
evaluated around $(\bar{x}, \bar{y})$ as the origin.


10) Overall Width ($\ell$) - The overall width of a blob is defined as

$$\ell = x_R - x_L$$

where $x_R$ is the abscissa of the lower right corner of the smallest
rectangle circumscribed around the blob and $x_L$ that of the upper left
corner of the same rectangle.
11) Overall Height (h) - The overall height is simply the
difference between the ordinates corresponding to $x_R$ and $x_L$:

$$h = y_R - y_L . \qquad [15]$$

12) $\ell/h$ Ratio

13) Bulkiness (e) - The bulkiness is the proportion of the
circumscribed rectangle occupied by the blob:

$$e = \frac{A}{\ell \times h} . \qquad [16]$$


14) Major and Minor Diameters ($d_1$, $d_2$) - The eigenvalues of the
matrix

$$\begin{bmatrix} M_{2,0} & M_{1,1} \\ M_{1,1} & M_{0,2} \end{bmatrix}$$

of second central moments are

$$r_1^2,\; r_2^2 = \frac{1}{2} \left[ (M_{2,0} + M_{0,2}) \pm \sqrt{(M_{2,0} - M_{0,2})^2 + 4 M_{1,1}^2} \right] . \qquad [17]$$

These eigenvalues are the principal moments of inertia of the blob and
it can be shown that the larger eigenvalue corresponds to the principal
axis. The major and minor diameters are then defined as

$$d_1 = \sqrt{12}\, r_1 \quad \text{and} \quad d_2 = \sqrt{12}\, r_2 . \qquad [18]$$
15) Aspect Ratio (a) - The aspect ratio is given by

$$a = d_1 / d_2 \qquad [19]$$

where $d_1$ denotes the major diameter. This quantity is then always
greater than (or possibly equal to) 1.
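The geometrical features 8) to 15) can be gathered in one short routine. The sketch below is illustrative only and rests on several assumptions that are not taken from the report: the function name `shape_features` is hypothetical; each pixel carries a weight (its gray level, or 1 for a binary blob); the circumscribed rectangle is measured along pixel edges, so that a single pixel is 1 wide; the angle is obtained with `atan2` rather than a plain arctangent to avoid a division by zero; and the diameters use the factor $\sqrt{12}$, which makes a uniform bar of length d return approximately d.

```python
import math

def shape_features(pixels):
    """Shape features of one blob.

    pixels: list of (x, y, w) tuples, w > 0 being the pixel weight.
    """
    m00 = sum(w for _, _, w in pixels)
    xc = sum(w * x for x, _, w in pixels) / m00      # centroid abscissa
    yc = sum(w * y for _, y, w in pixels) / m00      # centroid ordinate
    # Second central moments M_pq = m_pq / m_00 - xc^p * yc^q
    M20 = sum(w * x * x for x, _, w in pixels) / m00 - xc * xc
    M02 = sum(w * y * y for _, y, w in pixels) / m00 - yc * yc
    M11 = sum(w * x * y for x, y, w in pixels) / m00 - xc * yc
    theta = 0.5 * math.atan2(2.0 * M11, M20 - M02)   # principal axis
    # Eigenvalues of [[M20, M11], [M11, M02]] in closed form
    root = math.sqrt((M20 - M02) ** 2 + 4.0 * M11 ** 2)
    r1_sq = 0.5 * (M20 + M02 + root)
    r2_sq = max(0.5 * (M20 + M02 - root), 0.0)       # clamp rounding noise
    d1 = math.sqrt(12.0 * r1_sq)                     # major diameter
    d2 = math.sqrt(12.0 * r2_sq)                     # minor diameter
    xs = [x for x, _, _ in pixels]
    ys = [y for _, y, _ in pixels]
    width = max(xs) + 1 - min(xs)                    # pixel-edge convention
    height = max(ys) + 1 - min(ys)
    area = len(pixels)
    return {
        "centroid": (xc, yc),
        "theta": theta,
        "width": width,
        "height": height,
        "width_height_ratio": width / height,
        "bulkiness": area / (width * height),
        "major_diameter": d1,
        "minor_diameter": d2,
        "aspect_ratio": d1 / d2 if d2 > 0 else float("inf"),
    }
```

For a uniform 4 x 2 block of pixels the routine returns a centroid of (1.5, 0.5), a principal axis of 0, a bulkiness of 1 and an aspect ratio of $\sqrt{5}$.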


The last 8 features aim at characterizing the statistics of S (segmented
object) and $\bar{S}$ (background of S). A complete specification of the
statistics of, say, S is possible if one knows its moments $m_k$ (there
should be no confusion between $m_{p,q}$ and $m_k$ since the latter has only one
index) defined by

$$m_k = E\{n(x,y)^k\} = \frac{1}{A} \sum_{(x,y) \in S} n(x,y)^k . \qquad [20]$$

Clearly,

$$m_1 = B$$

the brightness or the average intensity of S. The corresponding central
moments are given by

$$M_k = E\{(n(x,y) - B)^k\} = \frac{1}{A} \sum_{(x,y) \in S} (n(x,y) - B)^k . \qquad [21]$$

These can be expressed in terms of the $m_k$:

$$M_k = \sum_{r=0}^{k} (-1)^r \frac{k!}{r!\,(k-r)!} B^r m_{k-r} . \qquad [22]$$
In particular

$$M_2 = m_2 - B^2 , \qquad [23]$$

$$M_3 = m_3 - 3 B m_2 + 2 B^3 , \qquad [24]$$

$$M_4 = m_4 - 4 B m_3 + 6 B^2 m_2 - 3 B^4 . \qquad [25]$$

Given these expressions we readily obtain:

16) Blob Variance ($\sigma^2$)

$$\sigma^2 = M_2 . \qquad [26]$$

17) Blob Relative Variance - This quantity is defined in the same way
as the relative intensity.

18) Blob Skewness

$$M_3 / \sigma^3 \qquad [27]$$

19) Blob Kurtosis

$$M_4 / \sigma^4 - 3 \qquad [28]$$

and their counterparts as regards $\bar{S}$,

20) Background Variance

21) Background Relative Variance

22) Background Skewness

23) Background Kurtosis
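The statistical features can be obtained from the raw moments alone, which is what makes them easy to accumulate during a single scan. The following sketch assumes that the gray levels of the region (S or its background) are available as a list; the function name `gray_level_statistics` is hypothetical, and returning 0 for the skewness and kurtosis of a perfectly flat region is a convention chosen here.

```python
import math

def gray_level_statistics(gray_levels):
    """Brightness, variance, skewness and kurtosis of a region.

    gray_levels: list of the gray levels n(x, y) of the region's pixels.
    """
    a = len(gray_levels)
    # Raw moments m_k = (1/A) * sum of n^k, for k = 0..4 (m[0] is 1).
    m = [sum(n ** k for n in gray_levels) / a for k in range(5)]
    b = m[1]                                   # brightness B = m_1
    # Central moments expressed through the raw moments, [23] to [25].
    M2 = m[2] - b ** 2
    M3 = m[3] - 3 * b * m[2] + 2 * b ** 3
    M4 = m[4] - 4 * b * m[3] + 6 * b ** 2 * m[2] - 3 * b ** 4
    sigma = math.sqrt(M2) if M2 > 0 else 0.0
    skewness = M3 / sigma ** 3 if sigma > 0 else 0.0
    kurtosis = M4 / sigma ** 4 - 3 if sigma > 0 else 0.0
    return {"brightness": b, "variance": M2,
            "skewness": skewness, "kurtosis": kurtosis}
```

For the gray levels 1, 2, 3, 4, 5 this gives a brightness of 3, a variance of 2, a skewness of 0 and a kurtosis of -1.3.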
5.0 SEQUENTIAL FEATURE EXTRACTOR

In this section we briefly describe (more information can be
found in Ref. 4) the sequential algorithm used to extract from a
segmented image the various features listed in the preceding section.
Such an algorithm is called a feature extractor and can be regarded
as an emulation of a hypothetical hardware unit [...]. The [...]
procedure of the extractor is sometimes referred to in the scientific
literature as the Labelling-by-Tracking (LT) Algorithm (Ref. 5).
References 4 and 8
describe  a  more  complex  extractor,  the  Boundary  Continuation  Algorithm, 
that  can  also  fulfill  the  same  task.  Given  a  thresholding  intensity 
function  of  the  kind  defined  in  Sec.  2,  both  extractors  can  segment  the 
image,  identify  the  objects  generated,  and  extract  the  relevant  features 
in  a  single  image  scan. 

The  memory  of  the  LT-extractor  consists  of  a  scan  line  array  plus 
a  feature  array.  The  first  array  contains  the  current  scan  line  of  the 
image being processed as well as the immediately preceding scan
line. It is initially set to zero and afterwards updated by replacing
the preceding scan line by the current one and reading in the subsequent
one. The scan line array is equivalent to viewing the image through a
downward moving slit whose width matches that of the image, but which is
only 2 pixels in height. The feature array has an arbitrary number of lines
whereas the number of columns is a function of the number of features to
be extracted. The number of columns is not exactly equal to the number
of  features  because  some  of  these  are  nothing  but  a  combination  of  other 
features (e.g. the aspect ratio), while, on the contrary, it is
necessary to accumulate more than one quantity to determine other
features (e.g. the principal axis). On the other hand, there are no
fixed rules as regards the number of lines. We can say, for sure, that
it should at least be equal to the maximum number of expected targets
and nontargets in a single frame but, in practice, it should be chosen
larger than that because at some point in the extraction process an
object may temporarily be split up into several components. These
components [...] If the feature array is not large enough, the
extraction process will not necessarily be impeded but it might be
slowed down appreciably because of frequent updates (Ref. 4). One
column of the feature array is set
apart for a substitution table that keeps track of all the components of
the various objects. This table is the key for updating the feature
array (Ref. 4).

We  detail  hereafter  the  procedure  used  by  the  LT-extractor  to 
identify the objects generated by the segmentation process. The
identification is done by labelling the various objects, that is,
assigning a specific number to each of them. It should be obvious from
this procedure that the pixels belonging to an object are assumed to be
8-connected (Refs. 4 and 5). Let $n_i(j)$ be the label assigned to the
jth pixel (Fig. 5) of scan line i (even though both use the same letter,
the gray level n(x,y) is unlikely to be confused with the label $n_i(j)$)
and $n_o$ the last number utilized to label a new object. Furthermore, let
FIGURE 5 - a) Fraction of the scan line array with the current
pixel (i,j) marked with an asterisk
us assume that all the pixels belonging to the background are set to
zero. Then,

1) If

$n_{i-1}(j-1) \times n_{i-1}(j+1) \neq 0$

and

$n_{i-1}(j-1) \neq n_{i-1}(j+1)$

we conclude that the relevant pixels are connected through the diagonal,
and consequently that they belong to the same object. We determine
which label has the greatest value, say $n_{i-1}(j-1)$, and replace in the
substitution table all the $n_{i-1}(j-1)$ [...]

2) Next, we put

$n_i(j) = n_{i-1}(j-1)$ if $n_{i-1}(j-1) \neq 0$; otherwise
$n_i(j) = n_{i-1}(j)$ if $n_{i-1}(j) \neq 0$; otherwise
$n_i(j) = n_{i-1}(j+1)$ if $n_{i-1}(j+1) \neq 0$; otherwise
$n_i(j) = n_i(j-1)$ if $n_i(j-1) \neq 0$; otherwise

we conclude that the pixel (i,j) is not the continuation of an existing
object but the beginning of a new one. We can then set

3) $n_i(j) = n_o + 1$

but  in  this  way  all  the  objects,  whatever  their  size,  will  be  labelled, 
that  is,  even  1-pixel  and  2-pixel  objects.  However,  these  objects  are 
obviously (here) nontargets and labelling them unnecessarily overloads
the feature array and, in some instances, can possibly saturate it. It
is then preferable to eliminate them at once. To this end, it suffices
to replace the preceding step 3) by

3') $n_i(j) = 0$ if $n_i(j+1) = 0$; otherwise
$n_i(j) = n_i(j+1) = n_{i-1}(j+2)$ if $n_{i-1}(j+2) \neq 0$; otherwise
$n_i(j) = n_i(j+1) = 0$ if $n_i(j+2) = 0$; otherwise

we conclude that the pixels (i,j) and (i,j+1) belong to a new object and
we set

$n_i(j) = n_i(j+1) = n_o + 1$.

It  is  worth  noting  that  this  step  also  eliminates  slanted  lines 
(lines  parallel  to  the  scan  direction  remain  but  they  are  eliminated 
later  when  the  feature  array  is  updated)  whose  width  is  less  than  3 
pixels  as  well  as  line-like  object  protuberances  jutting  out  counter  to 
the  scan  direction.  This  might  be  a  source  of  distortion  of  the  object 
shape,  but  probably  not  a  serious  one  considering  that  the  boundary  of 
the targets is in general relatively smooth (this is true mainly because
the  targets  are  small  and  line-like  features  such  as  a  tank's  gun  are 
unresolved) . 

4) If $n_i(j-1) \neq 0$

and

$n_i(j-1) \neq n_i(j)$

we are faced with 2 object segments connected by one end. This piece of
information is entered into the substitution table as described in 1).
Given that the number of lines of the feature array is fixed, it
might well happen that the extraction process will have to be halted
despite repeated updating operations because the feature array is full.
To reduce the number of updates and to prevent the feature array from
saturating, the above procedure can be modified so as to utilize the
smallest number of labels to identify the objects. Let J = j-1 and L(j)
be the number of pixels labelled $n_i(j)$. Then 4) is replaced by:

4') If $n_i(J) \neq 0$ and $n_i(J) \neq n_i(j)$ set

L(j) = L(j) + 1,
L(J) = L(J) - 1,
$n_i(J) = n_i(j)$,

and we repeat for J = J-1 if $n_i(J-1) \neq 0$; otherwise we put

$n_o = n_o - 1$ if L(J) = 0; otherwise

we conclude that there exists on scan line i-1 an object segment
labelled $n_i(J)$ that belongs to the same object as the segment $n_i(j)$. We
then modify the substitution table accordingly. This simple step
frequently allows us to save a label (Ref. 4).


5)  Repeat  from  1)  with  the  next  pixel  different  from  0. 
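The labelling steps 1) to 5) can be sketched in a few lines. What follows is a simplified illustration, not the extractor of Ref. 4: it implements steps 1), 2), 3) and 4) only (the refinements 3') and 4') are left out), it reduces the feature array to a single column holding the pixel count, and the names `label_by_tracking`, `substitution` and `feature` are hypothetical. The choice of always substituting the larger label by the smaller one is also an assumption.

```python
def label_by_tracking(image):
    """Single-scan 8-connected labelling of a thresholded image.

    image: list of equal-length rows; background pixels are 0.
    Returns {final_label: area}, a one-column stand-in for the
    feature array after its last update.
    """
    width = len(image[0]) if image else 0
    substitution = {}      # substitution table: label -> equivalent label
    feature = {}           # feature array reduced to the pixel count
    last_label = 0         # n_o in the text

    def resolve(label):
        # Follow the substitution table down to the representative label.
        while substitution[label] != label:
            label = substitution[label]
        return label

    def merge(a, b):
        # Record that labels a and b belong to the same object.
        ra, rb = resolve(a), resolve(b)
        if ra != rb:
            substitution[max(ra, rb)] = min(ra, rb)

    previous = [0] * width             # scan line i-1, initially zero
    for row in image:
        current = [0] * width          # scan line i
        for j, pixel in enumerate(row):
            if not pixel:
                continue
            up_left = previous[j - 1] if j > 0 else 0
            up = previous[j]
            up_right = previous[j + 1] if j + 1 < width else 0
            left = current[j - 1] if j > 0 else 0
            # Step 1: two different labels joined through the diagonal.
            if up_left and up_right and up_left != up_right:
                merge(up_left, up_right)
            # Step 2: inherit the first nonzero neighbouring label.
            label = up_left or up or up_right or left
            if not label:
                # Step 3: the pixel starts a new object.
                last_label += 1
                label = last_label
                substitution[label] = label
                feature[label] = 0
            # Step 4: the segment on the left carries a different label.
            if left and left != label:
                merge(left, label)
            current[j] = label
            feature[label] += 1
        previous = current             # slide the 2-line window down

    # Final update: fold every component into its representative label.
    areas = {}
    for label, area in feature.items():
        root = resolve(label)
        areas[root] = areas.get(root, 0) + area
    return areas
```

Only two scan lines are held in memory at any time, as in the scan line array described above. A V-shaped object whose two arms first receive different labels is merged into one entry when the arms meet, which is the role played by the substitution table.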
6.0 EXPERIMENTAL RESULTS


We  mentioned  in  Sec.  4  that  one  valuable  attribute  of  any 
segmenter  is  the  degree  of  distinctiveness  it  introduces  among  the 
segmented  objects  and,  in  particular,  between  targets  and  nontargets. 
We also pointed out that we cannot expect the objects to be dissimilar
in every way and, consequently, that the problem is really twofold:

a) Determine the most discriminatory feature or set of features.

b) Measure how well this feature or set of features separates the
objects corresponding to targets from those corresponding to
nontargets.

We have defined in Sec. 4 the complete set of features we intend to
consider for this purpose. The means to be used to assess the
discriminatory power of a given feature will consist in a comparison of
the histograms of that feature both for targets and nontargets, and this
for each one of the 6 segmenters described in Sec. 2. The total system
of operations, illustrated in Fig. 6, is:

1) The 43 raw images of the Alabama Data Base are processed in
turn by all 6 segmenters.

2) The segmented images are passed on to the LT-extractor and
the object features extracted according to the procedure
outlined in Sec. 5.


FIGURE 6 - System of operations leading to the determination of the
target and nontarget feature histograms
3) The resulting feature arrays (each segmenter gives rise to 43
feature arrays), duly updated, are stored in APL files (the
number of lines of any feature array is equal to the
number of objects in the corresponding segmented image and
the line numbers are the labels associated with those objects
as a result of the last update).

4)  A  target  identification  array  is  assigned  to  each  APL  file. 
This  43-line  array  contains  the  labels  of  the  objects 
corresponding  to  targets.  It  is  defined  according  to  target 
location  data  collected  beforehand. 

5)  An  APL  program  automatically  determines  and  plots  the  various 
target  and  nontarget  feature  histograms. 

Being  stored  in  APL  files,  the  feature  arrays  can  be  analyzed 
interactively, and thus the above system of operations offers great
flexibility.

6.1 Object Feature Histograms

All the histograms presented in this report have the same number
of bins, namely, 20. The target and nontarget histograms of a
particular feature are plotted side by side and both the horizontal and
the vertical scales are the same for ease of comparison. The range
(horizontal scale) of a histogram is generally that of the feature
itself with the exception of the area (limited to 400) and the perimeter
(limited to 200). The number of elements in a bin (vertical axis) is
expressed as a percentage of the total number of targets or nontargets
as  the  case  may  be.  In  all  instances  where  the  numerical  value  of  a 
feature  (e.g.  thinness  ratio),  as  defined  in  Sec.  4,  is  always  less  than 
1,  this  value  is  multiplied  by  100  in  order  to  get  rid  of  fractional 
numbers. 
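The histogram conventions just described (20 bins, counts expressed as percentages) can be illustrated as follows. The function name `percent_histogram` is hypothetical, and clipping out-of-range values into the end bins is an assumption: the report only says that the range of the area and perimeter histograms is limited, not how larger values are handled.

```python
def percent_histogram(values, low, high, bins=20):
    """Histogram of `values` over [low, high], in percent of the total."""
    counts = [0] * bins
    span = high - low
    for v in values:
        v = min(max(v, low), high)      # clip to the plotted range (assumption)
        k = min(int((v - low) / span * bins), bins - 1)
        counts[k] += 1
    total = len(values)
    if total == 0:
        return [0.0] * bins
    return [100.0 * c / total for c in counts]
```

Calling the function once with the target values and once with the nontarget values of a feature, using the same `low` and `high`, yields a histogram pair with identical horizontal and vertical scales.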


6.1.1  SIT  Generator 

The  SIT  Generator  manages  to  extract  (segment  from  the 
background)  73  targets  out  of  85.  Most  missed  targets  are  APC's.  The 
number of nontargets generated by this segmenter, on the other hand, is
rather high, namely 1011. It was included in the present study mostly
for historical reasons (Refs. 2 and 3) but also as a standard by which
the results of more sophisticated segmenters are evaluated. The
histograms arising from this segmenter make up Fig. 7. There are 13
histogram pairs corresponding to as many features. Certain features
listed in Sec. 4 were excluded either because they turn out to be
useless (e.g. skewness) or because they are not distinctive
characteristics in themselves (e.g. centroid). These results (as well
as those of the next 2 sections) will be discussed further in a
subsequent section.

6.1.2  SIT  Generator  with  BET  (Mean) 


This  segmenter  is  better  suited  for  the  task  at  hand  since  it 
extracts  83  targets  out  of  85  while  producing  only  half  as  many 
nontargets (584) as the preceding segmenter. The 13 histogram pairs
that sum up the results obtained with this segmenter are given in Fig.
8, which is the counterpart of Fig. 7.

6.2  Scatter  Plots 

Figures 9 and 10 show scatter plots of the following features:

a) Relative intensity

b) Relative contrast

c) Relative blob variance

that happen to be the most useful as far as discrimination between
targets and nontargets is concerned. These scatter plots might be
somewhat misleading, however, for a plotted point often corresponds to
more than one datum. In other words, 2 pairs of features from 2
segmented objects may well match each other and, consequently, the
relevant objects may be represented by a single point in the scatter
plots (for example, there are 1011 nontargets associated with segmenter
No. 1 but only 232 plotted points in Fig. 10a). So one should not
attempt to draw conclusions based on the density of the points plotted
in these figures. It is also worth mentioning that it is not the
variance ($\sigma^2$) which is actually plotted in Figs. 9 and 10 but $\sigma$, the
positive square root (standard deviation) of the variance. This is
equally true of Figs. 7 and 8.
FIGURE 7 - The 13 histogram pairs derived
from segmenter No. 1.

The  associated  features  are: 

a)  Area 

b)  Perimeter 

c)  Thinness  Ratio 

d)  Relative  Intensity 

e)  Relative  Contrast 

f)  Overall  Width 

g)  Overall  Height 

h)  Width/Height  Ratio 

i)  Bulkiness 

j)  Minor  Diameter 

k)  Major  Diameter 

l)  Aspect  Ratio 

m)  Blob  Relative  Variance 
FIGURE 8 - The 13 histogram pairs derived
from segmenter No. 6.

The associated features are the same
as in Fig. 7.
FIGURE 9 - This figure, derived from segmenter
No. 1, shows scatter plots of the
following features:

a) Relative Intensity versus Relative
Contrast

b) Relative Intensity versus Relative
Blob Variance

c) Relative Contrast versus Relative
Blob Variance
FIGURE 10 - This figure, derived from segmenter No. 6,
is the equivalent of Fig. 9.
6.3 Comments

We will often refer, in the remainder of this section, to
specific images of the Alabama Data Base. Readers interested in viewing
these images are directed to Ref. 1 where histogram-equalized pictures
of the 43 images that make up the database are given along with the
relevant ground truth.

We mentioned in Sec. 4 that a given segmenter is first evaluated
according to its ability to segment just about all the targets liable to
be perceived in the pictured scene. In Table I, the objects generated
by all 6 segmenters, in relation to the Alabama Data Base, are sorted by
object type by means of the provided ground truth. As
mentioned before, the SIT Generator produces the largest number of
nontargets. This might be an indication that this segmenter is more
prone to false alarms than the others. However, there are really no
grounds for believing that the number of false alarms is generally
directly proportional to the number of nontargets. The only thing we
know for sure is that the number of false alarms will be equal to 0 if
the number of nontargets is equal to 0. On the other hand, a small
number of nontargets is no guarantee of efficiency as one can see from
the figures for the SCIT Generator. Of the first 3 segmenters, the
ISCIT Generator is the best at extracting targets. Nevertheless, its
extraction rate is not as good as that of a segmentation algorithm
incorporating, in one way or another, the Background Elimination
Technique described in detail in Ref. 1, and outlined in Sec. 2 of this
report. This is a point in favor of this technique. On the sole
basis of their extraction rate, 2 segmenters, segmenters No. 4 and No.
6, surpass the rest. The 2 targets missed by both these segmenters are
APC's: one in image 13 and one in image 41. Interestingly enough, the
first 3 segmenters do extract the APC in image 13. However, the APC in
image 41 eludes all 6 extraction schemes. Finally, Table I shows that
the tank is probably the easiest target to extract.
Table I does not give a complete picture of the segmentation
results. It leaves out the problem of repeated detections, that is, of
targets split up into several blobs and hence likely to be construed as
forming a group of distinct targets. This problem is not too important
as far as segmenter No. 6 is concerned for the only multiblob targets
(in Table I a multiblob target is classified as 1 target of the
appropriate type) are the tank in image [...] and the bus in image 33, 2
relatively large-size targets. Segmenter No. 4, however, is much more
affected with nearly 10 multiblob targets. As this serious flaw might
greatly reduce the usefulness of this segmenter, segmenter No. 6 emerges
here as the best. No multiblob target arises from the SIT Generator
whereas segmenters 2, 3 and 5 produce only one such target.

Another  useful  criterion  to  assess  the  practicality  of  a 
segmenter  is  the  fidelity  of  the  segmentation  process  with  regard  to  the 
geometrical  properties  of  the  targets.  From  this  point  of  view, 
segmenter  No.  1  and  No.  6  may  be  rated  as  the  best.  Segmenter  No.  4 
exhibits  a  marked  tendency  to  narrow  the  targets  (Fig.  3d)  but  this  is 
to  be  expected  since  it  is  based  on  the  HFS  images.  The  other  3 
segmenters  (2,  3  and  5)  generally  exaggerate  the  size  of  the  targets 
even  to  the  point  of,  sometimes,  merging  2  neighboring  targets  into  a 
single  blob  (in  Table  I  such  a  blob  was  classified  as  a  nontarget). 
Also,  in  a  few  instances,  although  the  target  was  not  connected  to 
another  target  blob  its  shape  was  so  distorted  as  to  be  unrecognizable. 
This is the case, for example, in relation to segmenter No. 5, of the
tank in image [...] and of the APC's in images 9 and 17. These distorted
targets were classified as nontargets in Table I. The last 3
segmenters  in  this  table  would  otherwise  have  the  same  extraction  rate. 
It should be obvious from the preceding paragraphs that
segmenters 1 and 6 are the most interesting ones. This explains why,
not to mention more prosaic reasons, the only feature histograms
appearing in this report pertain to these 2 segmenters. However, the
general conclusions drawn from Figs. 7 to 10 apply equally well to all 6
segmenters.

By examining Figs. 7 and 8, we readily conclude that only 3 out
of the 13 plotted features are really peculiar to a target blob. These
are:


a)  Relative  Intensity 

b)  Relative  Contrast 

c) Relative Blob Variance

It turns out quite naturally that these are relative features. Indeed,
this is the only way to eliminate the [...] imposed by the
experimental conditions and also by the fact that we are dealing with a
discontinuous sequence of pictures. Although the other features are no
good at discriminating targets from nontargets, they might well be very
useful to classify the targets themselves. However, this is something
we will not attempt to do in the present report. It is not surprising
that intensity and contrast features are distinguishing target
characteristics since we are dealing with IR imagery. Nevertheless, it
is amazing to observe that so is the variance. Given the size of
the targets, we would rather intuitively expect the variance to be
insignificant, but the experimental results show that this is not the
case. Another important aspect should be emphasized here. In Fig. 7
the numerical values assigned to the various features were derived from
the original raw images, whereas in Fig. 8 these numerical values
originate from the Mean Fine Structure images, that is, images that bear
little resemblance to the original ones. So one may wonder, in the case
of segmenter No. 6, what the feature values would have been had their
evaluation been based on the original images. It makes no difference
for shape features (e.g. area, perimeter, overall width, etc.) but this
should normally affect moment features such as the intensity, contrast,
minor diameter, etc. that depend on the gray level of the pixels
involved. Fig. 11 is meant to elucidate the question. As we can see,
the moment features in Figs. 8 and 11 exhibit the same trends except for
the relative intensity, which is obviously not a distinctive target
feature when evaluated from the original images. There is then no point
in going back to the original images insofar as segmenter No. 6 is
concerned. This, in fact, confirms that BET saves all the useful
information about the targets.

Once features peculiar to the targets have been
identified, one can assess the degree of distinctiveness imparted to the
targets as opposed to the nontargets. That quantity is inversely related
to the extent of overlap of the relevant pair of histograms (Figs. 7 and
8), and can then be determined accordingly. However, it serves our
purpose better to give here some examples of the classification results
one can expect from the aforementioned subset of 3 features. To this
end, the confusion matrix

$$\begin{bmatrix} \text{target} & \text{miss} \\ \text{false alarm} & \text{nontarget} \end{bmatrix}$$

has been determined for various combinations of the 3 features. The
decision  rule  for  each  feature  (the  t's  in  Fig.  12)  is  simply  a  fixed 
threshold  whose  level  corresponds  to  the  5th  percentile  of  the  feature's 
target histogram. Hence, all the objects associated with a feature
whose value is lower than the specified threshold are discarded as
nontargets (Fig. 12). The classification results that ensue for each
feature  alone,  for  a  combination  of  2  out  of  the  3  features,  and  for  all 
3  features  together  are  shown  in  Table  II.  In  this  table,  these 
features are identified as follows (Sec. 4): 6: Relative Intensity;
7: Relative Contrast; and 17: Relative Blob Variance. It is important to
note that the order of the features in a combination is not immaterial
for, given the structure of the decision tree (Fig. 12), the results are
not necessarily the same if the features undergo a permutation. Also,
for the same reason, the probability of detection (number of targets
classified as such) of any combination of features cannot exceed that of
its least effective member. However, by combining features one can
greatly reduce the number of false alarms. To convince oneself that
this is indeed the case it suffices to compare the confusion matrix for
feature 7 (Table II) to that for features 7 and 17. Clearly, a
trade-off has to be made between the detection rate one would like to
obtain and the false alarm rate that can be tolerated. In any case,
Table II shows that it should be possible to obtain with segmenter No. 6
a detection rate in excess of 90% with a false alarm rate not greater
than 3%.
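The threshold-based decision rule and the resulting confusion matrix can be sketched as below. This is an illustration under explicit assumptions rather than the decision tree of Fig. 12, which is not reproduced here: the names `percentile_threshold` and `classify` are hypothetical, the percentile uses a nearest-rank rule, an object is kept as a target only if every tested feature is at or above its threshold (a threshold at the 5th percentile of the target histogram then retains roughly 95% of the targets), and the thresholds are fixed in advance instead of being recomputed at each stage of the tree.

```python
def percentile_threshold(target_values, percent=5):
    """Nearest-rank percentile of the target feature values.

    percent is an integer between 1 and 100.
    """
    ordered = sorted(target_values)
    rank = max((percent * len(ordered) + 99) // 100, 1)   # ceiling division
    return ordered[rank - 1]

def classify(objects, tests):
    """Confusion matrix of a sequence of threshold tests.

    objects: list of (features, is_target) pairs, features being a dict
             mapping a feature name to its value.
    tests:   ordered list of (feature_name, threshold) pairs.
    """
    matrix = {"target": 0, "miss": 0, "false alarm": 0, "nontarget": 0}
    for features, is_target in objects:
        # An object survives only if it passes every test in turn.
        accepted = all(features[name] >= threshold
                       for name, threshold in tests)
        if is_target:
            matrix["target" if accepted else "miss"] += 1
        else:
            matrix["false alarm" if accepted else "nontarget"] += 1
    return matrix
```

Because an object must pass every test, adding a feature to the list can only remove detections and false alarms, never add any, which is the trade-off discussed above.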
A classification experiment using the Fisher linear discriminant
(Ref. 5) was attempted on the scatter plots (Figs. 9 and 10) but the
results merely point out that one is entitled to use the features
independently. The Fisher linear discriminant attempts to find the
optimum linear projection of the feature vectors onto a line, and the
optimum partition of this line, such that the ratio of between-class
scatter to within-class scatter is maximized (Ref. 6).
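For two features, the projection direction of the Fisher linear discriminant has a closed form, $w = S_w^{-1}(\mu_a - \mu_b)$, where $S_w$ is the within-class scatter matrix and $\mu_a$, $\mu_b$ are the class means. The sketch below computes that direction only; the function name `fisher_direction` is hypothetical and the choice of the partition point on the projected line is left out.

```python
def fisher_direction(class_a, class_b):
    """Fisher projection direction for two classes of 2-D feature vectors.

    class_a, class_b: lists of (f1, f2) pairs.
    Returns w = S_w^{-1} (mean_a - mean_b) as a pair.
    """
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n,
                sum(p[1] for p in points) / n)

    def scatter(points, mu):
        # Entries of the 2 x 2 scatter matrix of one class.
        sxx = sum((p[0] - mu[0]) ** 2 for p in points)
        sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
        syy = sum((p[1] - mu[1]) ** 2 for p in points)
        return sxx, sxy, syy

    mu_a, mu_b = mean(class_a), mean(class_b)
    axx, axy, ayy = scatter(class_a, mu_a)
    bxx, bxy, byy = scatter(class_b, mu_b)
    # Within-class scatter S_w is the sum of the two class scatters.
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] applied to (dx, dy).
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
```

When the two classes differ along one feature only, the direction returned is aligned with that feature axis.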
FIGURE 11 - The 6 histogram pairs derived from
segmenter  No.  6.  The 
associated  features  listed  below 
were  evaluated  from  the  original 
raw  images  instead  of  the  Mean 
Fine  Structure  images: 

a)  Relative  Intensity 

b)  Relative  Contrast 

c)  Minor  Diameter 

d)  Major  Diameter 

e)  Aspect  Ratio 

f)  Blob  Relative  Variance 
7.0 CONCLUSION

The purpose of the present report was to evaluate the performance
of 6 different segmentation algorithms or segmenters based on the single
assumption that the targets present a stronger thermal signature than the
background. The first 3 segmenters considered deal with an image in its
entirety, whereas the last 3 incorporate a technique, the Background
Elimination Technique or BET, which aims at eliminating wholly or partly
the background. The segmenters are judged according to:

a) their extraction rate;

b) the fidelity of the segmentation with respect to the
geometrical properties of the extracted targets;

c) the degree of distinctiveness imparted to the
extracted targets as opposed to the nontargets.

The 3 segmenters relying on BET have a better extraction rate than the
other 3 that try to cope with the background simply by partitioning the
image. Most segmenters here distort in one way or another the shape of
the  targets.  The  2  exceptions  are  segmenters  No.  1  (Single  Intensity 
Threshold  Generator  or  SIT  Generator)  and  No.  6  (SIT  Generator  in 
conjunction  with  BET  through  the  Mean  Fine  Structure  image) .  To 
determine  the  degree  of  distinctiveness,  one  must  first  single  out  the 
feature  or  set  of  features  that  most  characterizes  the  targets.  The 
experimental  results  show  that  the  following  relative  features  are  the 
best  for  this  purpose: 
a) Relative Intensity

b) Relative Contrast

c) Relative Blob Variance

It is not surprising to note that intensity and contrast features are
distinguishing target characteristics since we are dealing with IR
imagery. It is amazing, however, to observe that so is the
variance feature. The classification results one can expect from these
features together with the segmenter that proves to be the best
(segmenter No. 6) amount to a detection rate in excess of 90% with a
false alarm rate not greater than 3%. On the other hand, it should be
possible to refine that segmenter further to bring the extraction rate
up to 100%, although 97% is already quite acceptable.
However, the next thing to do would rather be to test segmenter No. 6 on
more complex imagery.